Sparse Quadratic Logistic Regression in Sub-quadratic Time
Authors
Abstract
We consider support recovery in the quadratic logistic regression setting, where the target depends on both the p linear terms x_i and up to p^2 quadratic terms x_i x_j. Quadratic terms enable modeling and prediction of higher-order effects between features and the target, but when incorporated naively they lead to a very large regression problem. We consider the sparse case, where at most s terms (linear or quadratic) are non-zero, and provide a new, faster algorithm. It involves (a) identifying the weak support (i.e., all relevant variables) and (b) standard logistic regression optimization restricted to these chosen variables. The first step relies on a novel insight about correlation tests in the presence of non-linearity and takes O(pn) time for n samples, giving potentially huge computational gains over the naive approach. Motivated by insights from the Boolean case, we propose a non-linear correlation test for the non-binary, finite-support case that involves hashing a variable and then correlating it with the output variable. We also provide experimental results to demonstrate the effectiveness of our methods.
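The two-stage procedure lends itself to a short illustration. The sketch below is not the paper's algorithm: the names `hashed_correlation_scores` and `fit_sparse_quadratic`, the number of hashes `n_hashes`, and the top-k selection rule are all assumptions standing in for the paper's actual test statistic and threshold. It only mirrors the overall shape described in the abstract: stage (a) hashes each finite-support feature to random ±1 labels and correlates with the output, and stage (b) fits a standard logistic regression restricted to the selected variables and their pairwise products.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LogisticRegression

def hashed_correlation_scores(X, y, n_hashes=10, seed=None):
    """Stage (a) stand-in: score each finite-support feature by the
    average |correlation| between random +/-1 relabelings (hashes) of
    its values and the centered labels."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    yc = y - y.mean()
    scores = np.zeros(p)
    for j in range(p):
        vals, inv = np.unique(X[:, j], return_inverse=True)
        for _ in range(n_hashes):
            signs = rng.choice([-1.0, 1.0], size=len(vals))
            h = signs[inv]                      # hashed copy of feature j
            scores[j] += abs(np.dot(h - h.mean(), yc)) / n
    return scores / n_hashes

def fit_sparse_quadratic(X, y, k, **lr_kwargs):
    """Keep the k highest-scoring features as the weak support, then run
    stage (b): logistic regression on those features plus all of their
    pairwise products."""
    support = np.argsort(hashed_correlation_scores(X, y, seed=0))[-k:]
    cols = [X[:, j] for j in support]
    cols += [X[:, i] * X[:, j] for i, j in combinations(support, 2)]
    Z = np.column_stack(cols)
    return LogisticRegression(**lr_kwargs).fit(Z, y), np.sort(support)

# Toy usage: the label depends on x_0, x_1 and the interaction x_0 * x_1.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(2000, 50)).astype(float)
logit = 1.5 * X[:, 0] - 1.0 * X[:, 1] + 2.0 * X[:, 0] * X[:, 1] - 2.0
y = (rng.random(2000) < 1.0 / (1.0 + np.exp(-logit))).astype(int)
model, support = fit_sparse_quadratic(X, y, k=5, max_iter=2000)
print(support)  # typically contains features 0 and 1
```

Two remarks on why this shape makes sense. First, each score costs O(n) per hash, so stage (a) runs in O(pn) time up to the constant number of hashes, consistent with the complexity claim in the abstract. Second, hashing matters for non-binary features: if the conditional mean of y across a feature's values cancels under the natural numeric encoding, the raw Pearson correlation can be zero even though the feature is relevant, while a random ±1 relabeling breaks the cancellation with positive probability; averaging over several hashes guards against an unlucky draw.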
Similar resources
Efficient evaluation of scaled proximal operators
Quadratic-support functions [Aravkin, Burke, and Pillonetto; J. Mach. Learn. Res. 14(1), 2013] constitute a parametric family of convex functions that includes a range of useful regularization terms found in applications of convex optimization. We show how an interior method can be used to efficiently compute the proximal operator of a quadratic-support function under different metrics. When th...
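As background (this is the standard definition, not quoted from the paper above): the proximal operator of f under a positive-definite metric M, which is the object such an interior method computes, is

$$\operatorname{prox}^{M}_{f}(v) = \arg\min_{x}\; f(x) + \tfrac{1}{2}\lVert x - v\rVert_{M}^{2}, \qquad \lVert z\rVert_{M}^{2} = z^{\top} M z,$$

and taking M = I recovers the usual (unscaled) proximal operator.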
A multilevel framework for sparse optimization with application to inverse covariance estimation and logistic regression
Solving l1-regularized optimization problems is common in the fields of computational biology, signal processing, and machine learning. Such l1 regularization is utilized to find sparse minimizers of convex functions. A well-known example is the LASSO problem, where the l1 norm regularizes a quadratic function. A multilevel framework is presented for solving such l1-regularized sparse optimizati...
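For concreteness, the LASSO problem mentioned in this snippet is the textbook formulation (this is standard material, not a quote from the cited paper):

$$\min_{w \in \mathbb{R}^{p}} \; \tfrac{1}{2}\lVert Xw - y\rVert_{2}^{2} + \lambda \lVert w\rVert_{1}, \qquad \lambda > 0,$$

where the l1 norm regularizes the quadratic data-fit term and larger λ yields sparser minimizers w.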
Gini Support Vector Machine: Quadratic Entropy Based Robust Multi-Class Probability Regression
Many classification tasks require estimation of output class probabilities for use as confidence scores or for inference integrated with other models. Probability estimates derived from large margin classifiers such as support vector machines (SVMs) are often unreliable. We extend SVM large margin classification to GiniSVM maximum entropy multi-class probability regression. GiniSVM combines a q...
Optimizing Revenue over Data-driven Assortments
We revisit the problem of assortment optimization under the multinomial logit choice model with general constraints and propose new efficient optimization algorithms. Our algorithms do not make any assumptions on the structure of the feasible sets and in turn do not require a compact representation of constraints describing them. For the case of cardinality constraints, we specialize our algori...
Supplement Materials for “An Improved GLMNET for L1-regularized Logistic Regression”
This document presents some materials not included in the paper. In Section II, we show that the solution of subproblem (13) converges to zero. In Section III, we show that newGLMNET has quadratic convergence if the loss function L(·) is strictly convex and the exact Hessian is used as H in the quadratic sub-problem. In Section IV, we show that newGLMNET terminates in finite iterations even wit...
Journal: CoRR
Volume: abs/1703.02682
Publication date: 2017